A DOP Model for Semantic Interpretation
نویسندگان
چکیده
In data-oriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragments from the corpus in the most probable way. This approach has been successfully used for syntactic analysis , using corpora with syntactic annotations such as the Penn Tree-bank. If a corpus with semantically annotated sentences is used, the same approach can also generate the most probable semantic interpretation of an input sentence. The present paper explains this semantic interpretation method. A data-oriented semantic interpretation algorithm was tested on two semantically annotated corpora: the English ATIS corpus and the Dutch OVIS corpus. Experiments show an increase in semantic accuracy if larger corpus-fragments are taken into consideration. 1 Introduction Data-oriented models of language processing embody the assumption that human language perception and production works with representations of concrete past language experiences, rather than with abstract grammar rules. Such models therefore maintain large corpora of linguistic representations of previously occurring utterances. When processing a new input utterance, analyses of this utterance are constructed by combining fragments from the corpus ; the occurrence-frequencies of the fragments are used to estimate which analysis is the most probable one. For the syntactic dimension of language, various instantiations of this data-oriented processing or "DOP" approach have been worked out (e.g. A method for extending it to the semantic domain was first introduced by van den Berg et al. (1994). In the present paper we discuss a computationally effective version of that method, and an implemented system that uses it. We first summarize the first fully instantiated DOP model as presented in Bod (1992-1993). Then we show how this method can be straightforwardly extended into a semantic analysis method, if corpora are created in which the trees are enriched with semantic annotations. Finally, we discuss an implementation and report on experiments with two semantically analyzed corpora (ATIS and OVIS). 2 Data-Oriented Syntactic Analysis So far, the data-oriented processing method has mainly been applied to corpora with simple syntactic annotations, consisting of labelled trees. Let us illustrate this with a very simple imaginary example. Suppose that a corpus consists of only two trees: S S NP VP NP VP Det N sings Det N whistles I I I I every woman a man Figure 1: Imaginary corpus of two trees We employ one operation for combining subtrees, called composition, indicated as o; …
منابع مشابه
A Data-Oriented Approach to Semantic Interpretation
In Data-Oriented Parsing (DOP), an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new input sentence is constructed by combining sub-analyses from the corpus in the most probable way. This approach has been succesfully used for syntactic analysis, using corpora with syntactic annotations such as the Penn Treebank. If a corpus with semantically annotat...
متن کاملA DOP model for Lexical-Functional Grammar
It is well-known that there exist many syntactic and semantic dependencies that are not reflected directly in a surface tree. All modern linguistic theories propose more articulated representations and mechanisms in order to characterize such linguistic phenomena. The Tree-DOP model is thus limited in that it cannot account for these phenomena. In this chapter, we show how a DOP model can be de...
متن کاملA Data-Oriented Approach to Semantic Interpretation1
In Data-Oriented Parsing (DOP), an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new input sentence is constructed by combining sub-analyses from the corpus in the most probable way. This approach has been succesfully used for syntactic analysis, using corpora with syntactic annotations such as the Penn Treebank. If a corpus with semantically annotat...
متن کاملSpoken Dialogue Interpretation with the DOP Model
We show how the DOP model can be used for fast and robust processing of spoken input in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO 1 Priority Programme "Language a...
متن کاملContext-sensitive spoken dialogue processing with the DOP model
We show how the DOP model can be used for fast and robust context-sensitive processing of spoken input in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO Priority Progr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1997